A Privacy-Preserving Federated Learning Method with Homomorphic Encryption in Omics Data

Negoya, Yusaku, Cui, Feifei, Zhang, Zilong, Pan, Miao, Ohtsuki, Tomoaki, Li, Aohan

arXiv.org Artificial Intelligence

Omics data is widely employed in medical research to identify disease mechanisms and contains highly sensitive personal information. Federated Learning (FL) with Differential Privacy (DP) can protect omics data privacy against malicious attacks. However, FL with DP faces an inherent trade-off: stronger privacy protection degrades predictive accuracy due to the injected noise. Homomorphic Encryption (HE), on the other hand, allows computation on encrypted data and enables aggregation of encrypted gradients without DP-induced noise, which can increase predictive accuracy; however, it may increase the computation cost. To improve predictive accuracy while accounting for the computational abilities of heterogeneous clients, we propose a Privacy-Preserving Machine Learning (PPML)-Hybrid method that introduces HE. In the proposed PPML-Hybrid method, distributed clients select either HE or DP based on their computational resources, so that HE clients contribute noise-free updates while DP clients reduce computational overhead. Meanwhile, clients with high computational resources can flexibly adopt HE or DP according to their privacy needs. Performance evaluations on omics datasets show that our proposed method achieves comparable predictive accuracy while significantly reducing computation time relative to HE-only methods. Additionally, it outperforms DP-only methods under equivalent or stricter privacy budgets.
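The per-client selection the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the clipping bound, noise scale, and client mode assignment are invented here, and the HE path is simulated as a noise-free update (a real system would encrypt the clipped gradient with a scheme such as Paillier or CKKS and aggregate ciphertexts server-side).

```python
import numpy as np

rng = np.random.default_rng(0)

def client_update(grad, mode, sigma=0.8, clip=1.0):
    # Clip the gradient to bound its norm (required for the DP guarantee).
    g = grad * min(1.0, clip / np.linalg.norm(grad))
    if mode == "DP":
        # Low-resource clients perturb locally via the Gaussian mechanism.
        return g + rng.normal(0.0, sigma * clip, size=g.shape)
    # "HE" clients would encrypt g and the server would sum ciphertexts;
    # the encryption step is elided here, so the update stays noise-free,
    # matching the accuracy argument in the abstract.
    return g

def aggregate(updates):
    # Server-side averaging of (decrypted or noisy) client updates.
    return np.mean(updates, axis=0)

grads = [rng.normal(size=4) for _ in range(6)]
modes = ["HE", "HE", "DP", "DP", "DP", "HE"]  # chosen per client resources
global_update = aggregate([client_update(g, m) for g, m in zip(grads, modes)])
```

In this sketch the server sees DP clients' noisy plaintext updates directly, while HE clients' contributions would only ever be visible in aggregate after decryption.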



Optimal Client Sampling in Federated Learning with Client-Level Heterogeneous Differential Privacy

Xu, Jiahao, Hu, Rui, Kotevska, Olivera

arXiv.org Artificial Intelligence

Federated Learning with client-level differential privacy (DP) provides a promising framework for collaboratively training models while rigorously protecting clients' privacy. However, classic approaches like DP-FedAvg struggle when clients have heterogeneous privacy requirements, as they must uniformly enforce the strictest privacy level across clients, leading to excessive DP noise and significant model utility degradation. Existing methods to improve the model utility in such heterogeneous privacy settings often assume a trusted server and are largely heuristic, resulting in suboptimal performance and lacking strong theoretical underpinnings. In this work, we address these challenges under a practical attack model where both clients and the server are honest-but-curious. We propose GDPFed, which partitions clients into groups based on their privacy budgets and achieves client-level DP within each group to reduce the privacy budget waste and hence improve the model utility. Based on the privacy and convergence analysis of GDPFed, we find that the magnitude of DP noise depends on both model dimensionality and the per-group client sampling ratios. To further improve the performance of GDPFed, we introduce GDPFed$^+$, which integrates model sparsification to eliminate unnecessary noise and optimizes per-group client sampling ratios to minimize convergence error. Extensive empirical evaluations on multiple benchmark datasets demonstrate the effectiveness of GDPFed$^+$, showing substantial performance gains compared with state-of-the-art methods.
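The grouping-and-sampling idea behind GDPFed can be sketched as below. The specific budgets and per-group ratios are illustrative assumptions; the paper derives the optimal ratios from its convergence analysis, which this sketch does not reproduce.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(1)

def make_groups(budgets):
    """Partition client ids by their (discrete) privacy budget epsilon."""
    groups = defaultdict(list)
    for cid, eps in enumerate(budgets):
        groups[eps].append(cid)
    return dict(groups)

def sample_round(groups, ratios):
    """Independently sample clients within each group at that group's ratio,
    so client-level DP is enforced per group rather than uniformly."""
    selected = []
    for eps, members in groups.items():
        q = ratios[eps]
        selected += [cid for cid in members if rng.random() < q]
    return selected

budgets = [1.0, 1.0, 4.0, 4.0, 4.0, 8.0]   # heterogeneous client budgets
groups = make_groups(budgets)
ratios = {1.0: 0.5, 4.0: 0.8, 8.0: 1.0}    # looser budgets can sample more
chosen = sample_round(groups, ratios)
```

Grouping avoids forcing every client down to the strictest budget: each group's DP noise is calibrated to its own epsilon and sampling ratio.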


P3SL: Personalized Privacy-Preserving Split Learning on Heterogeneous Edge Devices

Fan, Wei, Yoon, JinYi, Li, Xiaochang, Shao, Huajie, Ji, Bo

arXiv.org Artificial Intelligence

Split Learning (SL) is an emerging privacy-preserving machine learning technique that enables resource-constrained edge devices to participate in model training by partitioning a model into client-side and server-side sub-models. While SL reduces computational overhead on edge devices, it encounters significant challenges in heterogeneous environments where devices vary in computing resources, communication capabilities, environmental conditions, and privacy requirements. Although recent studies have explored heterogeneous SL frameworks that optimize split points for devices with varying resource constraints, they often neglect personalized privacy requirements and local model customization under varying environmental conditions. To address these limitations, we propose P3SL, a Personalized Privacy-Preserving Split Learning framework designed for heterogeneous, resource-constrained edge device systems. The key contributions of this work are twofold. First, we design a personalized sequential split learning pipeline that allows each client to achieve customized privacy protection and maintain personalized local models tailored to their computational resources, environmental conditions, and privacy needs. Second, we adopt a bi-level optimization technique that empowers clients to determine their own optimal personalized split points without sharing private sensitive information (i.e., computational resources, environmental conditions, privacy requirements) with the server. We implement and evaluate P3SL on a testbed consisting of 7 devices, including 4 Jetson Nano P3450 devices, 2 Raspberry Pis, and 1 laptop, using diverse model architectures and datasets under varying environmental conditions. Experimental results demonstrate that P3SL significantly mitigates privacy leakage risks, reduces system energy consumption by up to 59.12%, and consistently retains high accuracy compared to state-of-the-art heterogeneous SL systems.
To protect data privacy, some research has proposed training entire machine learning models to process data locally [5]. However, training entire ML models on resource-constrained edge devices presents significant challenges, including high energy consumption and prolonged training durations.
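The core split-learning partition described above can be sketched in a few lines. The toy model, layer sizes, and split point here are arbitrary; in P3SL each client would choose its own split point via the bi-level optimization, which this sketch does not model.

```python
import numpy as np

rng = np.random.default_rng(2)

# A toy model as a list of layer functions; the split point k decides
# which layers run on the edge device and which on the server.
W = [rng.normal(scale=0.1, size=(8, 8)) for _ in range(4)]
layers = [lambda x, w=w: np.tanh(x @ w) for w in W]

def split_forward(x, layers, k):
    """Client runs layers[:k] and uploads only the smashed activation;
    the server finishes the forward pass with layers[k:]."""
    h = x
    for f in layers[:k]:          # client-side sub-model
        h = f(h)
    smashed = h                   # the only tensor that leaves the device
    for f in layers[k:]:          # server-side sub-model
        h = f(h)
    return smashed, h

x = rng.normal(size=(1, 8))
smashed, out = split_forward(x, layers, k=2)
```

A deeper split point (larger k) keeps more computation, and more information, on the device, which is exactly the resource/privacy trade-off the split-point choice controls.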


Generating Privacy Stories From Software Documentation

Baldwin, Wilder, Chintakuntla, Shashank, Parajuli, Shreyah, Pourghasemi, Ali, Shanz, Ryan, Ghanavati, Sepideh

arXiv.org Artificial Intelligence

Research shows that analysts and developers consider privacy as a security concept or as an afterthought, which may lead to non-compliance and violation of users' privacy. Most current approaches, however, focus on extracting legal requirements from regulations and evaluating the compliance of software and processes with them. In this paper, we develop a novel approach based on chain-of-thought (CoT) prompting, in-context learning (ICL), and Large Language Models (LLMs) to extract privacy behaviors from various software documents prior to and during software development, and then generate privacy requirements in the format of user stories. Our results show that most commonly used LLMs, such as GPT-4o and Llama 3, can identify privacy behaviors and generate privacy user stories with F1 scores exceeding 0.8. We also show that the performance of these models can be improved through parameter tuning. Our findings provide insight into using and optimizing LLMs for generating privacy requirements from software documents created prior to or throughout the software development lifecycle. Understanding the privacy behaviors of software applications and eliciting privacy requirements during the early phases of the software development lifecycle (SDLC) are essential for developing privacy-preserving and regulatory-compliant software [1], [2]. Past research, however, shows that software analysts and developers often consider privacy as a subset of security requirements or as an afterthought [3], [4], and they often lack the tools needed to understand and identify the privacy behaviors of the applications they develop [5], [6]. Most common approaches for identifying and eliciting privacy requirements include conducting privacy impact assessments [7], [8], or employing goal-oriented methodologies to map privacy requirements to system processes [8]-[10].
Other works aim to extract privacy-related information from user stories or use case models [11]-[17] by leveraging Natural Language Processing (NLP) techniques and then using predefined templates to generate privacy requirements. However, these approaches mostly focus on the specific forms of software documentation (i.e., user stories or use cases), or they rely on developers to understand how personal information is handled by their applications.
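A CoT-plus-ICL prompt of the kind the paper describes can be assembled as below. The instruction wording and the single few-shot example (document excerpt, reasoning, user story) are invented here purely for illustration; the paper's actual prompts and exemplars may differ.

```python
# Hypothetical in-context example: a (document, reasoning, story) triple.
ICL_EXAMPLES = [
    {
        "doc": "The app stores the user's email address to send receipts.",
        "reasoning": "The text describes collection and storage of a "
                     "personal identifier (email) for a stated purpose.",
        "story": "As a user, I want my email address to be stored only "
                 "for sending receipts, so that it is not reused for "
                 "other purposes.",
    },
]

def build_prompt(document_excerpt):
    """Assemble a chain-of-thought instruction plus in-context examples
    into a single prompt ending where the model should start reasoning."""
    parts = ["Identify privacy behaviors in the text, reason step by "
             "step, then write a privacy user story.\n"]
    for ex in ICL_EXAMPLES:
        parts.append(f"Text: {ex['doc']}\n"
                     f"Reasoning: {ex['reasoning']}\n"
                     f"Privacy story: {ex['story']}\n")
    parts.append(f"Text: {document_excerpt}\nReasoning:")
    return "\n".join(parts)

prompt = build_prompt("The service logs IP addresses for analytics.")
```

The prompt deliberately ends at "Reasoning:" so the model produces its chain of thought before emitting the user story, mirroring the CoT setup.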


Optimizing QoE-Privacy Tradeoff for Proactive VR Streaming

Wei, Xing, Han, Shengqian, Yang, Chenyang, Sun, Chengjian

arXiv.org Artificial Intelligence

Proactive virtual reality (VR) streaming requires users to upload viewpoint-related information, raising significant privacy concerns. Existing strategies preserve privacy by introducing errors to viewpoints, which, however, compromises the quality of experience (QoE) of users. In this paper, we first delve into the analysis of the viewpoint leakage probability achieved by existing privacy-preserving approaches. We determine the optimal distribution of viewpoint errors that minimizes the viewpoint leakage probability. Our analyses show that existing approaches cannot fully eliminate viewpoint leakage. Then, we propose a novel privacy-preserving approach that introduces noise to uploaded viewpoint prediction errors, which can ensure zero viewpoint leakage probability. Given the proposed approach, the tradeoff between privacy preservation and QoE is optimized to minimize the QoE loss while satisfying the privacy requirement. Simulation results validate our analysis results and demonstrate that the proposed approach offers a promising solution for balancing privacy and QoE.
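The key mechanism, perturbing the viewpoint prediction error rather than the viewpoint itself, can be sketched as follows. The viewpoint encoding, noise distribution, and noise scale are illustrative assumptions; the paper derives the optimal error distribution, which this sketch does not.

```python
import numpy as np

rng = np.random.default_rng(3)

def upload_viewpoint(true_vp, predicted_vp, noise_std=0.05):
    """Instead of uploading the raw viewpoint, the client uploads the
    prediction error perturbed with noise, so the server (which knows
    its own prediction) recovers only a noisy viewpoint estimate."""
    error = true_vp - predicted_vp
    return error + rng.normal(0.0, noise_std, size=error.shape)

true_vp = np.array([0.30, 0.62])        # e.g. normalized yaw/pitch
predicted_vp = np.array([0.28, 0.60])   # server-side prediction
noisy_error = upload_viewpoint(true_vp, predicted_vp)
reconstructed = predicted_vp + noisy_error  # what the server can infer
```

Larger noise_std lowers the viewpoint leakage probability but degrades QoE, since the server prefetches tiles for a less accurate viewpoint; that is the trade-off the paper optimizes.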


Federated Learning With Individualized Privacy Through Client Sampling

Lange, Lucas, Borchardt, Ole, Rahm, Erhard

arXiv.org Artificial Intelligence

With growing concerns about user data collection, individualized privacy has emerged as a promising solution to balance protection and utility by accounting for diverse user privacy preferences. Instead of enforcing a uniform level of anonymization for all users, this approach allows individuals to choose privacy settings that align with their comfort levels. Building on this idea, we propose an adapted method for enabling Individualized Differential Privacy (IDP) in Federated Learning (FL) by handling clients according to their personal privacy preferences. By extending the SAMPLE algorithm from centralized settings to FL, we calculate client-specific sampling rates based on their heterogeneous privacy budgets and integrate them into a modified IDP-FedAvg algorithm. We test this method under realistic privacy distributions and multiple datasets. The experimental results demonstrate that our approach achieves clear improvements over uniform DP baselines, reducing the trade-off between privacy and utility. Compared to the alternative SCALE method in related work, which assigns differing noise scales to clients, our method performs notably better. However, challenges remain for complex tasks with non-i.i.d. data, primarily stemming from the constraints of the decentralized setting.
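The idea of turning heterogeneous budgets into client-specific sampling rates can be sketched as below. The linear budget-to-rate scaling is a deliberate simplification of the SAMPLE-style calibration the paper adapts, and the budget values are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def sampling_rates(budgets, base_rate=0.5):
    """Map heterogeneous budgets to per-client sampling probabilities:
    the strictest budget gets the base rate, looser budgets sample more
    often (capped at 1). Linear scaling is a simplifying assumption."""
    eps = np.asarray(budgets, dtype=float)
    return np.minimum(1.0, base_rate * eps / eps.min())

def sample_clients(rates):
    # Poisson sampling: each client participates independently.
    return [i for i, q in enumerate(rates) if rng.random() < q]

budgets = [1.0, 2.0, 2.0, 8.0]    # per-client privacy budgets
rates = sampling_rates(budgets)   # strictest client sampled least often
chosen = sample_clients(rates)
```

Sampling strict-budget clients less often amplifies their privacy, so all clients can share one noise scale in the modified FedAvg round instead of being forced to the most pessimistic setting.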


Privacy in Metalearning and Multitask Learning: Modeling and Separations

Aliakbarpour, Maryam, Bairaktari, Konstantina, Smith, Adam, Swanberg, Marika, Ullman, Jonathan

arXiv.org Artificial Intelligence

Model personalization allows a set of individuals, each facing a different learning task, to train models that are more accurate for each person than those they could develop individually. For example, consider a set of people, each of whom holds a relatively small dataset of photographs labeled with the names of their loved ones that appear in each picture. Each person would like to build a classifier that labels future pictures with the names of people in the picture, but training such an image classifier would take more data than any individual person has. Even though the tasks they want to carry out are different--their photos have different subjects--those tasks share a lot of common structure. By pooling their data, a large group of people could learn the shared components of a good set of classifiers. Each individual could then train the subject-specific components on their own, requiring only a few examples for each subject. Other applications of personalization include next-word prediction on a mobile keyboard, speech recognition, and recommendation systems. The goals of personalization are captured in a variety of formal frameworks, such as multitask learning and metalearning.


Personalized Differential Privacy for Ridge Regression

Acharya, Krishna, Boenisch, Franziska, Naidu, Rakshit, Ziani, Juba

arXiv.org Artificial Intelligence

The increased application of machine learning (ML) in sensitive domains requires protecting the training data through privacy frameworks, such as differential privacy (DP). DP requires specifying a uniform privacy level $\varepsilon$ that expresses the maximum privacy loss that each data point in the entire dataset is willing to tolerate. Yet, in practice, different data points often have different privacy requirements. Having to set one uniform privacy level is usually too restrictive, often forcing a learner to guarantee the most stringent privacy requirement, at a large cost to accuracy. To overcome this limitation, we introduce our novel Personalized-DP Output Perturbation method (PDP-OP), which enables training Ridge regression models with individual per-data-point privacy levels. We provide rigorous privacy proofs for PDP-OP as well as accuracy guarantees for the resulting model. This work is the first to provide such theoretical accuracy guarantees for personalized DP in machine learning, whereas previous work only provided empirical evaluations. We empirically evaluate PDP-OP on synthetic and real datasets with diverse privacy distributions. We show that by enabling each data point to specify its own privacy requirement, we can significantly improve the privacy-accuracy trade-offs in DP. We also show that PDP-OP outperforms the personalized privacy techniques of Jorgensen et al. (2015).
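The shape of an output-perturbation approach with per-point privacy can be sketched as below. This is only a structural illustration, not PDP-OP itself: the per-point weighting rule, noise scale, and regularization strength are stand-in assumptions, whereas the paper calibrates noise to the individual $\varepsilon_i$ with formal proofs.

```python
import numpy as np

rng = np.random.default_rng(5)

def personalized_ridge(X, y, eps, lam=1.0, noise_scale=0.1):
    """Simplified sketch: down-weight each point by its budget relative to
    the loosest one (a stand-in for the paper's exact calibration), solve
    the weighted ridge problem in closed form, then perturb the released
    coefficients (output perturbation)."""
    w = np.asarray(eps, dtype=float) / max(eps)   # per-point weights in (0, 1]
    Xw = X * w[:, None]                           # rows scaled by weight
    theta = np.linalg.solve(Xw.T @ X + lam * np.eye(X.shape[1]), Xw.T @ y)
    return theta + rng.normal(0.0, noise_scale, size=theta.shape)

X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
eps = rng.uniform(0.5, 4.0, size=50)              # heterogeneous budgets
theta_priv = personalized_ridge(X, y, eps)
```

Points with strict budgets influence the solution less, so the added noise can be calibrated to looser effective sensitivity than a uniform worst-case epsilon would allow.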